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We investigate the crystallization mechanism of a single, flexible homopolymer chain with short range at¬ 
tractions. For a sufficiently narrow attractive well, the system undergoes a first-order like freezing transition 
from an expanded disordered coil to a compact crystalline state. Based on a maximum likelihood analysis of 
committor values computed for configurations obtained by Wang-Landau sampling, we construct a non-linear 
string reaction coordinate for the coil-to-crystal transition. In contrast to a linear reaction coordinate, the 
string reaction coordinate captures the effect of different degrees of freedom controlling different stages of the 
transition. Our analysis indicates that a combination of the energy and the global crystallinity parameter Qq 
provide the most accurate measure for the progress of the transition. While the crystallinity paramter Qq is 
most relevant in the initial stages of the crystallization, the later stages are dominated by a decrease in the 
potential energy. 


I. INTRODUCTION 


Many polymers go through large-scale conformational 
changes akin to phase transitions when external pa¬ 
rameters like the temperature or the solvent properties 
changed. A particularly simple example for such a sys¬ 
tem is a homopolymer chain with short-range attractions 
and strongly repulsive cores^. The complex phase be¬ 
havior of this system has been studied previously in a 
number of different studies. First, it was shown by Tay¬ 
lor and Lipson^, as well as later by Zhou etal£, that 
the chain’s radius of gyration is a sigmoidal function of 
temperature. More recently, the entire phase diagram of 
the system as a function of temperature and interaction 
range was mapped out by Taylor, Paul, and Binder— 

Of particular interest to this study is the first-order like 
coil-to-crystal freezing transition, which occurs for chains 
with very narrow attractive wells. This transition has 
been studied recently by Ruzicka, Quigley, and Aller£ in 
forward flux sampling simulations of a slightly modified 
model to allow the application of collision dynamics. In a 
further study, the authors of this paper have investigated 
the freezing transition using transition path sampling^ in 
combination with likelihood maximization^ in order to 
search for a reaction coordinate^. Here, we build on this 
work and improve the quality of the reaction coordinate 
by substituting the linear version used so far with a non¬ 
linear string reaction coordinate^. 


The remainder of this paper is organized as follows. In 
Sec.iniwe define the polymer model and give a short sum¬ 
mary of its properties. Methods and simulation details 
are discussed in Sec. m We present results in Sec. □30 
and provide a discussion in Sec.lVl 



Figure 1. Coil (top left), crystalline (bottom right), and two 
intermediate states of the polymer chain for particle num¬ 
ber N = 128, interaction range A = 1.05 and temperature 
k^T/e — 0.438. Crystalline and coil-like particles are colored 
in red and yellow, respectively, while intermediate particles 
are colored in blue. The criterion for crystallinity used here 
is defined in Ref. m 

II. POLYMER MODEL 

The model used in this study is a single, fully flexible 
chain of N identical monomers with a short-range at¬ 
traction between monomers, as well as a hard repulsive 
core (Fig.[Tj) . Non-neighboring monomers interact via a 
smoothed variant of a square-well potential^, 


\ exp 

r-OR-a)! 

+ tanh 

b 

< 

1 

0$ 

1 _ 

a 

a 

l 



(i) 

where R is the distance between the monomers and A > 1 
parametrizes the width of the potential well. We have 
chosen a value of a = 0.002 a for the parameter which 
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Figure 2. Smoothed square-well potential for A = 1.05. The 
harmonic spring potential (acting between neighboring beads 
only) is also shown for comparison. 

determines the steepness of the exponential repulsion and 
the width of the step at R = Act. Neighboring monomers 
are coupled via harmonic springs U ( R ) = | (R — cr) 2 with 
a value of k = 20000 a 2 /e for the spring constant. The 
pair potential, as well as the harmonic potential between 
chain neighbors, is shown in Fig.O 

Depending on the value of the interaction width A, 
there exist three different phases. At high temperatures, 
the system is in the expanded coil phase for all interac¬ 
tion widths. What happens when the system is cooled, 
however, depends on A. For wide wells, the system first 
undergoes a second-order collapsing transition to a com¬ 
pact, but unordered globule phase. Further cooling leads 
to a first-order freezing transition to a crystalline state. 
For sufficiently small values of A (A < 1.05 in the case 
of N = 128), the system directly freezes from the coil to 
the crystalline state without going through the molten 
globule phase. A detailed description on the chain’s 
phase behavior is given in the work of Taylor, Paul and 
Binder— as well as our own recent studyii. In the 
latter work, we also show that the phase behavior of the 
smoothed version of the chain is very similar to that of 
the original square-well chain. For simulations presented 
and discussed in this paper, we have chosen the TV = 128 
chain with an interaction length of A = 1.05. This sys¬ 
tem undergoes a direct freezing transition from the coil 
to the crystalline state, with a coexistence temperature 
of k^T/e — 0.438 =b 0.001—. An illustration of these two 
states is shown in Fig. [I] 

III. METHODS 

A. Definition of the stable states 

In this paper we study the freezing of the polymer. In 
accordance with our previous study of this systemii, we 
define the expanded coil as stable state A and the crystal 


as stable state B. The distinction is made based on the 
potential energy of the system. A configuration is con¬ 
sidered to be in the coil state if U/N > U m i n /TV = — 0.7 e 
and it is considered to be in the crystalline state if 
U/N < U max /N = —2.6 s. Note that in our model the 
potential energy is essentially proportional to the number 
of close contacts between non-neighboring monomers. 


B. Wang-Landau sampling 

Using a Wang-Landau simulation^ we have obtained 
a uniform sample of states outside the two stable basins 
A and B. With this technique, one iteratively constructs 
the density of states of the system by performing a Monte 
Carlo simulation with the inverse of the current estimate 
of the density of states as acceptance criterion. Once 
the simulation is converged, each energy interval of a 
given fixed width is visited with equal frequency. To 
speed up the simulation we have combined several types 
of Monte Carlo Moves including the bond-bridging move 
introduced in Ref.@. Further details of the Wang-Landau 
procedure, as well as the Monte Carlo moves used in the 
simulation, are given in Ref. U 


C. Committor analysis 

For a system with two (meta-) stable states A and 
T>, the committor p#(x) of configuration x is the frac¬ 
tion of dynamical pathways started from x that first 
reaches state B -. To compute the committor for a par¬ 
ticular x, one launches a number of trajectories start¬ 
ing with random momenta from x and counts the frac¬ 
tion of trajectories ending in B. Our committor calcu¬ 
lations were performed according to the algorithm de¬ 
scribed in Ref.Qj], using TV min = 100 and 7V max = 500. 
Since we are interested in the true mechanism of the 
coil-to-crystal transition, the procedure used to obtain 
trajectories for the calculation of committor values must 
resemble the natural dynamics of the system. If Monte 
Carlo dynamics is considered, this implies that only lo¬ 
cal moves can be used. Molecular dynamics provides a 
more physical (and computationally more efficient) way 
to model the time evolution of the system. We there¬ 
fore use the smooth (differentiable) potential of Eqn. (pQ) 
to facilitate such simulations and avoid the cumbersome 
handling of impulsive forces caused by the discontinu¬ 
ities in the original square-well potential. To evolve the 
system in time, we em ploy La ngevin dynamics with a 
time step At = 0.0002 y/mcr 2 je and a damping constant 
7 = 0.5 m 3 / 2 cr 2 6 -1 / 2 , where the actual integration is per¬ 
formed using the Langevin thermostat by Schneider and 
Stolid implemented in a modified version of LAMMPSi^. 
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Figure 3. Schematic representation (data not from simula¬ 
tion) of S M (q), for a string of length M — 5. The point q is 
first mapped to a point on the string S(q). This point is then 
mapped by a(S) to a number between 0 and 1 to obtain the 
progress along the string. 


D. String reaction coordinate 

To identify a valid reaction coordinate, we use the non¬ 
linear reaction coordinate analysis of Lechner et al In 
this approach, a reaction coordinate is constructed as a 
projection of a configuration on a piecewise linear string 
defined by a sequence of M points in an m -dimensional 
order parameter space. More formally, the reaction coor¬ 
dinate r associated with a point x in configuration space 
is defined as 

r(x) =/(a{S M [q(x)]}). (2) 

Here, we have a sequence of projection operations. 
q(x) maps the 3Wdimensional configuration to a low¬ 
dimensional order parameter space, so q = { q\ ,..., q m } 
is a vector of order parameter values. In this work, we 
restrict ourselves to m = 2, in other words, the string 
resides in a two-dimensional plane. S M (q) is the pro¬ 
jection onto the string, schematically illustrated in Fig. [3] 
for a plane of two generic order parameter q\ and , and 
a( S M ) is the mapping of the string point to a number be¬ 
tween 0 (state A, start of the string) and 1 (state H, end 
of the string). We use the geometric projection described 
in the work of Rogal et. alii. Finally, f(a) is a monotonic 
(cubic) spline that maps a to the reaction coordinate. 

The committor pb is assumed to be a sigmoidal func¬ 
tion of the model reaction coordinate, 

Pb{t) = I[1 +tanh(r)]. (3) 

The reaction coordinate, defined by the location of the 
string points, the relative scaling of the involved order 
parameters and the functional form of the projection, is 
constructed such that the likelihood 
B A 

L = Y[pB(r^)I[ll-PB(r^)} (4) 

k k' 

is maximized. The first product runs over all single 
shooting events ending in state H, while the second prod¬ 
uct runs over all shooting events ending in state A. The 


likelihood quantifies the compatibility of the proposed 
model with observed outcome of the shooting events. We 
use the Bayesian information criterion^ 

BIC = -2 InL + k(M) ln(n) (5) 

to compare the optimization results for different numbers 
of optimization parameters, where smaller BIC values are 
better. Here, n is the total number of observations, i. e., 
the total number of shooting events entering in Eq. (|4]) , 
and k(M) is the number of free parameters entering the 
model. The BIC penalizes models with too many free 
parameters, hence it is used to check whether it is sen¬ 
sible to add additional physical parameters to improve 
the model reaction coordinate. The first coordinate, in 
our study the polymer’s potential energy, is used to dis¬ 
tinguish the two stable states A and B, therefore, the 
end points of the string are held fixed at the stable state 
boundaries and can only move in the orthogonal direc¬ 
tion. The inner points of the string can move in any 
direction. In addition, we use M equally spaced points 
between 0 and 1 to define the mapping /(a), as well as 
an additional variable for the scaling of the plane along 
the second variable relative to the first one. Therefore, 
k(M) = 3M - 1 for M > 2. 

The actual optimization of the string is carried out ac¬ 
cording to the algorithm described in Ref. m. The pro¬ 
cedure is a steepest descent scheme, where one move is 
either one of three different choices: 

1. a string move, where the string itself is altered by 
displacing the points of the string; 

2. a move where the mapping /(a), which translates 
the progress along the string to a reaction coordi¬ 
nate, is altered; 

3. a scaling move, where the weight of the second vari¬ 
able relative to the first (the energy) is changed. 

In all cases, the move is accepted if the (log) likelihood 
increases and rejected otherwise. 


E. Order parameters 

In addition to the potential energy, we have calculated 
a number of structural order parameters for the polymer. 
These are 

• the size of the core N core : the number of particles 
in the largest cluster of crystalline particles; 

• the total number of crystalline particles 7V cryst ; 

• the total number of compact particles Af comp , de¬ 
fined as all particles with six or more adjacent par¬ 
ticles (distance smaller than 1.05 cr); 

• the mean squared radius of gyration ; 

• the global order parameters Q 4 and Q& 
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Figure 4. (Top) Optimized strings with M — 2, 3, 5, 9 in the 
U-Qq plane. The blue rectangle shows the range of order 
parameter values of the polymer configurations considered in 
the procedure. The color map in the background represents 
the predicted committor from the M — 5 string. The col¬ 
ored dots are the real committor values of the configurations 
used in the optimization, with the same color code as the one 
used for the predicted values. (Bottom) Optimized strings 
with M — 2, 3, 5, 8 in the U-N cor e plane. The color map in 
the background represents the predicted committor from the 
M = 8 string. 

• the polymer’s moments of inertia 1 % = J m i n , h, and 

^3 — ^maxj 

• the anisotropy a — -^max/^min 1* 

For all these variables, we have constructed string reac¬ 
tion coordinates with 2 < M < 9 in combination with the 
potential energy V as first coordinate. The criterion for 
crystallinity used in the calculation of A cryst is defined in 

Ref. [hi 

IV. RESULTS 

We have used a total of 3912 configurations in the 
energy range —256 <U/e < —163 with known commit¬ 
tor values, corresponding to the left and right borders of 
the blue rectangles in Fig.0J for the construction of the 
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Figure 5. Cubic spline mapping functions r = /(a) belonging 
to the four strings in the top panel of Fig. [4] 

string reaction coordinate. The string with M = 2 corre¬ 
sponds to the linear model reaction coordinate as intro¬ 
duced by Peters and Trout3, so as a first test, we have 
checked that the results obtained for the M = 2 string 
are in agreement with the results of our implementation 
of the linear optimization procedure. Apart from some 
discretization error due to the way the committor data 
are handled in our string coordinate code, these two like¬ 
lihood scores agree well. The score for the linear version 
also serves as a baseline to compare the likelihood score 
for more complex strings. In particular, for any combina¬ 
tion of variables, the likelihood score obtained by a string 
with M > 3 should be greater than the corresponding 
likelihood for the optimized linear reaction coordinate. 
It is worth noting that the main computational effort in 
the construction of the reaction coordinate goes into the 
calculation of committor values and has to be performed 
on a cluster with many computing cores. In contrast, 
the string optimization procedure, even for many com¬ 
binations of variables, can be done at comparatively low 
computational cost on a single workstation. 



BICmin 

In L 

M 

Qq 

231632 

-115725 

5 

A C ore 

237303 

-118502 

8 

Acryst 

237720 

-118711 

8 

h 

240494 

-120117 

7 

h 

246363 

-123052 

7 

h 

247179 

-123421 

9 

Q4 

254593 

-127166 

7 

Rl 

258044 

-128892 

7 

a 

263911 

-131787 

9 

Acomp 

266048 

-132894 

7 


Table I. Optimum BIC scores and the corresponding likeli¬ 
hoods, as well as the string length M at which the optimum 
was achieved, for all the investigated collective variables. Note 
that smaller BIC values are better. 
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Figure 6. Logarithmic likelihood (filled symbols, left scale) 
and BIC (open symbols, right scale) as a function of string size 
M for five independent runs of the optimization procedure, 
with the potential energy U and Qq as variables. The minimal 
BIC is reached for M — 5. The likelihood score for the linear 
model reaction coordinate is also shown for comparison. 


In Table HJ we have listed the top likelihood scores and 
the corresponding BIC for the combination of the po¬ 
tential energy with all the other variables described in 
Sec. lmEl as well as the string length M at which this 
value was reached. Note that the minimal BIC is al¬ 
ways reached at values of M > 2. In other words, in 
any case, the performance of the optimum string coor¬ 
dinate is better than the corresponding linear reaction 
coordinate, even after correcting for the higher model 
complexity introduced by the additional degrees of free¬ 
dom in the form of the coordinates of the string images 
as well as the parameters of f(a). Furthermore, as it is 
the case with the linear version of the reaction coordinate, 
the combination with the global order parameter Qq gets 
the highest score. Even with the added flexibility of the 
string, no combination of the energy with any other vari¬ 
able than Qq performs better than ([/, Qq) even in the 
linear case. However, the combinations ([/, N core ) and 
(E7, N cryst ) come very close, indicating that the flexibil¬ 
ity of the string can compensate for the lower quality of 
the order parameter combinations up to a certain point. 
Similarly, the likelihood score for the combination with 
the squared radius of gyration R 2 g is comparatively low 
even for the best string coordinate. 

As an illustration, we have plotted four optimized 
strings in the U-Qq plane in Fig.|4j Shown in the same 
figure are four optimized strings in the U-N core plane, 
the variable combination that gave the second-best likeli¬ 
hood. Note that for larger values of M the strings are dis¬ 
tinctly curved, deviating strongly from the linear reaction 
coordinate studied earlier. Also shown as color map in the 
same figure is a comparison of the predicted committor 
values of the used configurations with the actual one. 
The mappings r(a) corresponding to the U-Qq strings 
are shown in Fig. [5] 
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Figure 7. The computed committor values plotted against the 
optimum model reaction coordinate as given by the M = 5 
string in the U-Qq plane. Shown in red is the ideal model 
committor of Eq. J3|) . 


The likelihood score, as well as the Bayesian informa¬ 
tion criterion, as a function of M for a number of in¬ 
dependent optimization runs—each with Qq as second 
variable—is shown in Fig.[6j The M = 2 scores corre¬ 
spond to the linear reaction coordinate. One can ob¬ 
serve that the score reaches a plateau for M = 5, in other 
words, it is sufficient to use a string with that length. Any 
further addition of string points only increases the model 
complexity without improving its accuracy, a fact which 
is also conveyed by the Bayesian information criterion. In 
Fig. [3 we have plotted this optimum reaction coordinate 
given by the M=5 string—as obtained by one of the op¬ 
timizations runs—against the real committor value. For 
a perfect fit, the real committor values would coincide 
with the hyperbolic tangent function shown in red in the 
same figure. 


V. DISCUSSION 

For sufficiently narrow attractive wells, the polymer 
chain investigated in this paper shows a two-state fold¬ 
ing transition from the expanded coil to the crystalline 
state. In the present work, using likelihood maximiza¬ 
tion, we have constructed a string reaction coordinate 
for this process. As already observed in our previous 
study using a purely linear reaction coordinate, the com¬ 
bination of the potential energy with the global order 
parameter Qq gave the best likelihood score. Due to the 
form of the pair potential, the potential energy of the sys¬ 
tem is basically a measure for the number of contacts be¬ 
tween non-neighboring monomers. Moreover, Qq , which 
is sensitive to closed-packed structures, adds information 
about the crystallinity of a given configuration. There¬ 
fore, it is sensible that a combination of these two pa¬ 
rameters will work rather well as a reaction coordinate, 
since during a typical folding transition, both the number 
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of contacts as well as the crystalline order will increase. 
Due to the curved nature of the string, our string re¬ 
action coordinate is able to follow this transition more 
closely than a purely linear reaction coordinate. This is 
what leads to the observed improvement in the likelihood 
score. In particular, upon following the string along the 
crystallization, one observes a change of behavior from 
the initial to the final stages of the transition. Initially, 
the system changes mainly by increasing its overall crys¬ 
tallinity as quantified by the Qq parameter. Presumably, 
this is due to the formation of a small crystalline core in 
the system. Later on in the crystallization, the strongest 
change is seen in the potential energy, caused by a steady 
growth of the initial crystalline core leading to more and 
more particles packed closely together. 

However, the total improvement of the quality of the 
found reaction coordinate is rather modest. It remains 
therefore a challenging task to identify better order pa¬ 
rameters as candidates in the construction of reaction 
coordinates. More specifically, these order parameters, 
while still being as symmetric as possible, should also 
take into account the order of particles along the polymer 
chain, which has been completely neglected so far. In the 
case of more complex polymers, for example if there is 
less energetic difference between unfolded and crystalline 
state, it might also help to work with the connectivity 
information in a more detailed way, rather than with the 
number of all connections alone. 
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